DETAILED ACTION 

Remarks 

1. In view of the Pre-Appeal Brief Request filed on 04/15/2008, PROSECUTION IS 
HEREBY REOPENED. A new ground of rejection is set forth below. 

To avoid abandonment of the application, appellant must exercise one of the 
following two options: 

(1) file a reply under 37 CFR 1.111 (if this Office action is non-final) or a reply 
under 37 CFR 1.113 (if this Office action is final); or, 

(2) initiate a new appeal by filing a notice of appeal under 37 CFR 41.31 
followed by an appeal brief under 37 CFR 41.37. The previously paid notice 
of appeal fee and appeal brief fee can be applied to the new appeal. If, 
however, the appeal fees set forth in 37 CFR 41.20 have been increased 
since they were previously paid, then appellant must pay the difference 
between the increased fees and the amount previously paid. 

A Supervisory Patent Examiner (SPE) has approved of reopening prosecution by 
signing below: 

2. Claims 4, 9, 15, 18 and 20 have been amended. 

3. The rejection of claims 4, 5, 9, 10, 15, 16, 18 and 19 under 35 U.S.C. § 112, 
second paragraph, is withdrawn in view of Applicant's amendment. 

4. Claims 1-20 remain pending and have been examined. 
Drawings 

5. The replacement drawings filed on December 19, 2007 are accepted by the 
Examiner. 

Response to Arguments 

6. Applicant's arguments filed on 04/15/2008, in particular those on pages 6-16 of the 
Appellants' Appeal Brief, have been fully considered. 

■ At page 6, in the fifth and sixth paragraphs of section VII Arguments, the 
Applicants submit that the latest rejection fails to address the difference 
between the current application and the prior art reference, because "in the 
present invention, the L1 cache is used for the matrix data transfer between 
main memory and the FPUs" (see for example, the fifth paragraph of page 6). 
However, it should be noted that the language of claim 1 does not recite any 
L1 limitation. 

■ At page 6, last paragraph, the Applicant argues that the prior art method in 
Nakazawa provides a hardware solution whereas the present invention provides a 
general software solution. The Applicant further points out, in the first paragraph 
of page 7, that Nakazawa would work only on (now-obsolete) 1995 hardware. 
However, the Examiner's position is that the basic methods/ideas of the prior 
art reference and of the present application, namely preloading data to the FPU to 
improve efficiency and speed in executing a linear algebra subroutine, are the 
same. It is obvious that said method/idea can be implemented/realized using 
either a hardware solution or a software solution. 

■ At pages 11-14, the Applicant submits that the rejection of record does not 
establish a reasonable rationale to modify Nakazawa on the ground that 
Dhablania provides a known improvement of the technique described in 
Nakazawa. However, the Examiner's position is that the prior art reference 
Nakazawa discloses a method to preload data from memory/cache to a 
floating point register before executing a calculation (see for example, col. 8, 
lines 21-23, "An element to be calculated at the i-th loop is load at the loop 
before the i-th loop."). The prior art reference Dhablania discloses the details 
of how to preload data (see for example, col. 4, lines 21-26, 
"The FPU 70 includes a load/store stage with 4-deep load and store queues"). 
Therefore, Dhablania's preload method can be incorporated into Nakazawa's 
method to preload data into the floating point register before executing the 
calculation. It is also obvious that said combination can be 
implemented/realized using either a hardware or a software implementation. 

■ At pages 15-16, the Applicant argues that the prior art references Nakazawa, 
Dhablania and Dongarra do not teach the limitations recited in claims 4, 5 and 
18. The Examiner thanks the Applicant for pointing out the difference between 
the prior art and the present application. However, the claim language, e.g. that of 
claim 4, does not recite the L1 BLAS and L2 BLAS limitations that the Applicant 
argues. 
Claim Rejections - 35 USC § 112 

7. The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

8. Claim 20 is rejected under 35 U.S.C. 112, second paragraph, as being indefinite 
for failing to particularly point out and distinctly claim the subject matter which 
applicant regards as the invention. 

Claim 20: 

The term "n cycles" in claim 20 is a relative term which renders the claim 
indefinite. The term "n cycles" is not defined by the claim, the specification does 
not provide a standard for ascertaining the requisite degree, and one of ordinary 
skill in the art would not be reasonably apprised of the scope of the invention. 
For the purpose of compact prosecution, the examiner treats "n cycles" as --one 
or more cycles--. 



Claim Rejections - 35 USC § 103 

9. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

10. Claims 1, 2 and 20 are rejected under 35 U.S.C. 102(b) as being anticipated by 
Nakazawa (Nakazawa et al., US 5,438,669) in view of Dhablania (US 6,115,730) 
in further view of Mulla (Mulla et al., US 6,507,892). 

Claim 1: 

Nakazawa discloses a software method of executing a linear algebra subroutine, 
said method comprising: for an execution code controlling operation of a floating 
point unit (FPU) performing said linear algebra subroutine execution, using 
preload instruction to preload data into a floating point register (FRegs) of said 
FPU (see for example, Fig. 3, element 105, "Physical Floating Point Register 
Group", element 106 "Floating Point Calculator", element 102 "Instruction 
Controller" and related text; also see Figs. 4B, 4C, "Floating Point Register Preload 
Instruction", "Extended Floating Point Register Preload Instruction" and related 
text; further see col. 7, lines 2-11, "the program by the loop unrolling method 
requires four floating point registers and one general register for vector data 
storage..."). 

But Nakazawa does not explicitly disclose the details of overlapping 
by preloading data. However, Dhablania, in the analogous art of reloadable 
floating point units, discloses a software method of improving at least one of 
efficiency and speed in executing a linear algebra subroutine on a computer 
having a floating point unit (FPU) and a load/store unit (LSU) capable of 
overlapping loading data and processing of said data by the FPU, said method 
comprising: 

■ For an execution code controlling operation of said linear algebra subroutine 
execution, overlapping by preloading data into floating point registers 
(Fregs) of said FPU, said overlapping causing data to arrive into said Fregs to 
be timely executed by the FPU operations of said linear algebra subroutine on 
said FPU (see for example, Figs. 4a, 4b and related text; also see col. 1, section 
"Summary of the invention", "ability to initiate a next instruction held in a 
4-deep instruction queue before a prior instruction has finished"; col. 4, lines 
21-26, "The FPU 70 includes a load/store stage with 4-deep load and store 
queues"). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to use the methods disclosed by Nakazawa and 
Dhablania to improve the performance of an FPU by providing it with preload 
registers which enable initiation of a next instruction held in an instruction queue, 
as suggested by Dhablania (see for example, col. 1, Summary of the invention). 

Nakazawa and Dhablania disclose using a cache/unified cache to store data for 
transfer to the floating point register, but do not explicitly disclose that said caches 
are L1 caches. However, Mulla, in the analogous art of L1 cache memory, 
discloses using a multi-level memory hierarchy including an L1 cache to improve 
performance (see for example, col. 1, line 32 to col. 2, line 18). Therefore, it 
would have been obvious to one having ordinary skill in the art at the time the 
invention was made to use an L1 cache instead of directly accessing main memory, to 
optimize access time for cache hits and further improve the performance of the 
computer system as suggested by Mulla (see for example, col. 2, lines 15-18). 

Claim 2: 

Nakazawa, Dhablania and Mulla disclose the method of claim 1. Nakazawa 
further discloses wherein said instructions are unrolled repeatedly until the data 
loading reaches a steady state in which a data loading exceeds a data 
consumption (see for example, col. 5, lines 23-28, "With this loop unrolling 
method, a plurality of elements (=n) are processed in one loop, this loop unrolling 
method has 1/n the number of loops required by the conventional method"; also 
see Figs. 11 and 12 for unrolling results and related text). 
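
For illustration only, the loop unrolling and preloading discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from Nakazawa, Dhablania or the application: the function names, the unroll factor of 4, and the use of local variables as stand-ins for floating point registers are all hypothetical. It shows a dot product in which each unrolled iteration processes four elements, so the main loop runs about 1/4 as many times, and in which the operands for the next iteration are loaded before the arithmetic of the current iteration is performed.

```python
UNROLL = 4  # hypothetical unroll factor (the "n" in "1/n the number of loops")


def dot_plain(x, y):
    """Conventional loop: one element per iteration. Returns (result, loop count)."""
    total = 0.0
    loops = 0
    for i in range(len(x)):
        total += x[i] * y[i]
        loops += 1
    return total, loops


def dot_unrolled_preload(x, y):
    """Loop unrolled by UNROLL, with operands preloaded one block ahead.

    Local variables stand in for floating point registers (an analogy only).
    Returns (result, loop count).
    """
    length = len(x)
    blocks = length // UNROLL
    total = 0.0
    loops = 0
    if blocks > 0:
        # Prologue: preload the operands of the first block.
        a0, a1, a2, a3 = x[0], x[1], x[2], x[3]
        b0, b1, b2, b3 = y[0], y[1], y[2], y[3]
    for blk in range(blocks):
        # Operands preloaded during the previous iteration are consumed now.
        c0, c1, c2, c3 = a0, a1, a2, a3
        d0, d1, d2, d3 = b0, b1, b2, b3
        if blk + 1 < blocks:
            # Preload the operands of the next block before computing.
            j = (blk + 1) * UNROLL
            a0, a1, a2, a3 = x[j], x[j + 1], x[j + 2], x[j + 3]
            b0, b1, b2, b3 = y[j], y[j + 1], y[j + 2], y[j + 3]
        total += c0 * d0 + c1 * d1 + c2 * d2 + c3 * d3
        loops += 1
    # Remainder loop for lengths that are not a multiple of UNROLL.
    for i in range(blocks * UNROLL, length):
        total += x[i] * y[i]
        loops += 1
    return total, loops
```

In this sketch, for vectors of eight elements the unrolled version performs two loop iterations where the conventional loop performs eight. On real hardware the benefit would come from the loads overlapping with the floating point operations, which an interpreted sketch of this kind does not demonstrate.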

Claim 20: 

Nakazawa, Dhablania and Mulla disclose the method of claim 1. Dhablania 
further discloses wherein said FPU comprises said Fregs as interfaced with an 
L1 cache, said interface having a penalty of n cycles, said preloading eliminating 
this n-cycle penalty (see for example, Fig. 1a, elements 60, 29, "Unified Cache", 
"Write Buffer"; also see col. 10, lines 58-67, "eliminate a full cycle from time 
period" and related text). 

Claim 17: 

Nakazawa discloses a method of providing a service involving at least one of 
solving and applying a scientific/engineering problem, said method comprising at 
least one of: using a linear algebra software package that computes one or more 
matrix subroutines, wherein said linear algebra software package generates an 
execution code controlling a load/store unit loading data into a floating point 
register (FReg) for a floating point unit (FPU) performing a linear algebra 
subroutine execution, such that, for an execution code controlling operation of 
said FPU, an instruction is unrolled to cause a preloading of data into said FReg 
(see for example, Fig. 3, element 105, "Physical Floating Point Register Group", 
element 106 "Floating Point Calculator", element 102 "Instruction Controller" and 
related text; also see Figs. 4B, 4C, "Floating Point Register Preload Instruction", 
"Extended Floating Point Register Preload Instruction" and related text; further 
see col. 7, lines 2-11, "the program by the loop unrolling method requires four 
floating point registers and one general register for vector data storage..."). 

But Nakazawa does not explicitly disclose the details of overlapping 
by preloading data. However, Dhablania, in the analogous art of reloadable 
floating point units, discloses a software method of improving at least one of 
efficiency and speed in executing a linear algebra subroutine on a computer 
having a floating point unit (FPU) and a load/store unit (LSU) capable of 
overlapping loading data and processing of said FPU data by the FPU, said 
method comprising: 

■ For an execution code controlling operation of said linear algebra subroutine 
execution, overlapping by preloading data into floating point registers 
(Fregs) of said FPU, said overlapping causing data to arrive into said Fregs to 
be timely executed by the FPU operations of said linear algebra subroutine on 
said FPU (see for example, Figs. 4a, 4b and related text; also see col. 1, section 
"Summary of the invention", "ability to initiate a next instruction held in a 
4-deep instruction queue before a prior instruction has finished"; col. 4, lines 
21-26, "The FPU 70 includes a load/store stage with 4-deep load and store 
queues"). 

Therefore, it would have been obvious to one having ordinary skill in the art at 
the time the invention was made to use the methods disclosed by Nakazawa and 
Dhablania to improve the performance of an FPU by providing it with preload 
registers which enable initiation of a next instruction held in an instruction queue, 
as suggested by Dhablania (see for example, col. 1, Summary of the invention). 
Dhablania also discloses providing a consultation for the purpose of solving a 
scientific/engineering problem using said linear algebra software package (see 
for example, col. 1, section "Summary of the invention"). 

But neither of them explicitly discloses transmitting a result of said linear algebra 
software package on at least one of a network, a signal-bearing medium 
containing machine-readable data representing said result, and a printed version 
representing said result; and receiving a result of said linear algebra software 
package on at least one of a network, a signal-bearing medium containing 
machine-readable data representing said result, and a printed version 
representing said result. However, it is well known in the computer art that the 
result (data) of executing said linear algebra software package can be transmitted, 
stored and printed. Thus, it also would have been obvious. 

11. Claims 3-16, 18 and 19 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Nakazawa (Nakazawa et al., US 5,438,669) in view of 
Dhablania (US 6,115,730) and further in view of Dongarra (Dongarra et al., "A 
Set of Level 3 Basic Linear Algebra Subprograms"). 

Claim 3: 

Nakazawa and Dhablania disclose the method of claim 1, but neither of them 
explicitly discloses wherein said linear algebra subroutine comprises a matrix 
multiplication operation. However, Dongarra, in the analogous art of 
implementing Level 3 Basic Linear Algebra Subprograms, discloses a matrix 
multiplication operation (matrix-multiply) (see for example, p. 11, line 15, 
"matrix-multiply routine"). Therefore, it would have been obvious to one having ordinary 
skill in the art at the time the invention was made to use Nakazawa's calculator to 
perform a matrix multiplication operation. One would have been motivated to do so to 
improve efficiency and parallel processing capability as suggested by Dongarra 
(see for example, p. 1, abstract portion, lines 1-4, "The Level 3 BLAS are targeted 
at matrix-matrix operations, with the aim of providing more efficient, but portable, 
implementations of algorithms on high-performance computers, especially those 
with hierarchical memory and parallel processing capability."). 

Claim 4: 

Nakazawa and Dhablania disclose the method of claim 1, but do not explicitly 
disclose wherein said linear algebra subroutine comprises a subroutine 
equivalent to a LAPACK (Linear Algebra PACKage) subroutine, as modified in 
accordance with claim 1. However, Dongarra, in the analogous art of linear 
algebra, discloses LAPACK (LINPACK) (see for example, p. 1, Introduction, "The 
original basic linear algebra subprograms... have been used in a wide range of 
software including LINPACK [13]..."). Therefore, it would have been obvious to 
one having ordinary skill in the art at the time the invention was made to use 
an existing routine defined or implemented by LINPACK. One would have been 
motivated to do so to greatly simplify the implementation of the infrastructure as 
suggested by Dongarra (see for example, pp. 1-2, Introduction, "In particular, they 
are an aid to clarity, portability, modularity, and maintenance of software; and 
they have become a de facto standard for the elementary vector operations."). 

Claim 5: 

Nakazawa, Dhablania and Dongarra disclose the method of claim 4. Dongarra 
further discloses that said LINPACK subroutine comprises a BLAS Level 3 L1 cache 
kernel (see for example, p. 2, Introduction, "For example, no routines are included 
for matrix factorization; these are currently provided by LINPACK and will be 
included in a new linear algebra package currently under development..."). 

Claim 18: 

Nakazawa and Dhablania disclose the method of claim 17, but neither of them 
explicitly discloses wherein said linear algebra subroutine comprises a subroutine 
from a LAPACK (Linear Algebra PACKage). However, Dongarra, in the 
analogous art of linear algebra, discloses LAPACK (LINPACK) (see for example, 
p. 1, Introduction, "The original basic linear algebra subprograms... have been 
used in a wide range of software including LINPACK [13]..."). Therefore, it would 
have been obvious to one having ordinary skill in the art at the time the invention 
was made to use an existing routine defined or implemented by LINPACK. One 
would have been motivated to do so to greatly simplify the implementation of the 
infrastructure as suggested by Dongarra (see for example, pp. 1-2, Introduction, "In 
particular, they are an aid to clarity, portability, modularity, and maintenance of 
software; and they have become a de facto standard for the elementary vector 
operations."). 

Claim 19: 

Nakazawa, Dhablania and Dongarra disclose the method of claim 18. Dongarra 
further discloses that said LINPACK subroutine comprises a BLAS Level 3 L1 cache 
kernel (see for example, p. 2, Introduction, "For example, no routines are included 
for matrix factorization; these are currently provided by LINPACK and will be 
included in a new linear algebra package currently under development..."). 

Claims 6-11: 

Claims 6-11 are an apparatus version of the claimed method, wherein all claimed 
limitations have been addressed and/or set forth above in claims 1-5. Therefore, as 
the references teach all the limitations of claims 1-5, they also teach the limitations 
of claims 6-11 respectively. Thus, they also would have been obvious. 



Claims 12-16: 

Claims 12-16 are a software program product version of the claimed method, 
wherein all claimed limitations have been addressed and/or set forth above in 
claims 1-5. Therefore, as the references teach all the limitations of claims 1-5, 
they also teach the limitations of claims 12-16 respectively. Thus, they also would 
have been obvious. 



Conclusion 

12. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

13. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Zheng Wei, whose telephone number is (571) 
270-1059 and whose fax number is (571) 270-2059. The examiner can normally be 
reached on Monday-Thursday 8:00-15:00. 

If attempts to reach the examiner by telephone are unsuccessful, the 
examiner's supervisor, Tuan Q. Dam, can be reached on (571) 272-3695. The 
fax phone number for the organization where this application or proceeding is 
assigned is 571-273-8300. 

Any inquiry of a general nature or relating to the status of this application 
or proceeding should be directed to the TC 2100 Group receptionist whose 
telephone number is 571-272-1000. 

Information regarding the status of an application may be obtained from 
the Patent Application Information Retrieval (PAIR) system. Status information 
for published applications may be obtained from either Private PAIR or Public 
PAIR. Status information for unpublished applications is available through 
Private PAIR only. For more information about the PAIR system, see 
http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 
(toll-free). If you would like assistance from a USPTO Customer Service 
Representative or access to the automated information system, call 
800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/Z. W./ 

Examiner, Art Unit 2192 
/Tuan Q. Dam/ 

Supervisory Patent Examiner, Art Unit 2192 



